Radar-based gesture enhancement for voice interfaces

ABSTRACT

This document describes techniques and systems that enable radar-based gesture enhancement for voice interfaces. The techniques and systems use a radar field to accurately determine three-dimensional (3D) gestures that can be used instead of, or in combination with, a voice interface to enhance interactions with voice-controllable electronic devices. These techniques allow the user to make 3D gestures from a distance to provide a voice input trigger (e.g., a “listen” gesture), interrupt and correct inaccurate actions by the voice interface, and make natural and precise adjustments to functions controlled by voice commands.

RELATED APPLICATION

The present application is a continuation of and claims priority to U.S. patent application Ser. No. 16/108,815, filed Aug. 22, 2018, the entire disclosure of which is hereby incorporated by reference.

BACKGROUND

Smartphones are not used just to communicate and shop. Increasingly, smartphones are used to control and interact with our environment through smart-home or home-automation systems. Through these systems, users can play music or other audio, turn lights on and off, adjust thermostats and appliances, and control many other functions. Users of smart-home systems often use voice commands to interact with applications on their electronic devices, especially when touch inputs are difficult or inconvenient, such as when a room is dark or the user's smartphone is out of reach. For example, many smartphones and smart-home systems include a voice interface (sometimes called a voice assistant) that listens for its name or other activation word and, once activated, can perform tasks based on voice commands. Voice interfaces use speech-recognition techniques to enable simple voice commands, such as turning lights on or off, adjusting audio volume levels, and so forth. Using a voice interface to interact with an electronic device to perform more-complex tasks, however, can be inconvenient, ineffective, and frustrating.

In particular, as more and more devices become able to receive voice commands, it can be a challenge for users to make complex, device-specific voice commands via a voice interface. In part, these difficulties arise because human conversation and communication is a mix of verbal and nonverbal communication, but a voice interface can only understand the verbal part of voice commands. For example, a command to turn on the lights may result in every light being turned on when the user meant to turn on only a reading lamp. A command to turn the music up can be similarly misunderstood by the voice interface unless the user adds details to explain how much to turn the music up or engages in a back-and-forth dialog with the voice interface until the volume is correct. Additionally, once the voice interface starts talking or performing a task improperly, it can be difficult to interrupt and correct the voice interface. Further, to be able to respond to voice commands, the voice interface must be listening nearly all the time, which can increase power consumption and lead to unintentional commands or unexpected interruptions by the voice interface. These and other problems can lead to frustration and inaccurate or incomplete input. Thus, users may not realize the full potential of their electronic devices because of the limitations of voice interfaces.

SUMMARY

This document describes techniques and systems that enable radar-based gesture enhancement for voice interfaces. The techniques and systems use a radar field to accurately determine three-dimensional (3D) gestures that can be used instead of, or in combination with, a voice interface to enhance interactions with voice-controllable electronic devices. These techniques allow the user to make 3D gestures from a distance to provide a voice input trigger (e.g., a “listen” gesture), interrupt and correct inaccurate actions by the voice interface, and make natural and precise adjustments to functions controlled by voice commands.

Aspects described below include a smartphone comprising a microphone, a radar system, one or more computer processors, and one or more computer-readable media. The radar system is implemented at least partially in hardware and provides a radar field. The radar system also senses reflections from an object in the radar field and analyzes the reflections from the object in the radar field. The radar system further provides, based on the analysis of the reflections, radar data. The one or more computer-readable media include stored instructions that can be executed by the one or more computer processors to implement a radar-based application. The radar-based application maintains the microphone in a non-operational mode. The radar-based application also detects, based on the radar data, a gesture by the object in the radar field and determines, based on the radar data, that the gesture is a voice input trigger. In response to determining that the gesture is the voice input trigger, the radar-based application causes the microphone to enter an operational mode.

Aspects described below also include a system comprising an electronic device that includes a microphone, a radar system, one or more computer processors, and one or more computer-readable media. The radar system is implemented at least partially in hardware and provides a radar field. The radar system also senses reflections from an object in the radar field and analyzes the reflections from the object in the radar field. The radar system further provides, based on the analysis of the reflections, radar data. The one or more computer-readable media include stored instructions that can be executed by the one or more computer processors to implement an interaction manager. The interaction manager generates a virtual map of an environment. The virtual map identifies a location and a type of one or more devices in the environment that can be interacted with via the interaction manager. The interaction manager receives, at a first time, a voice command directed to at least one of the one or more identified devices, and the voice command includes at least the type of the at least one device. At a second time that is later than, or at approximately a same time as, the first time, the interaction manager determines, based on the radar data, a three-dimensional (3D) gesture. The 3D gesture corresponds to a sub-command that is related to the voice command. In response to the voice command and the 3D gesture, the interaction manager causes the at least one of the one or more devices to perform an action that corresponds to the voice command and the sub-command.

Aspects described below also include a method, implemented in an electronic device that includes a radar system, a radar-based application, and a microphone. The method comprises providing, by the radar system, a radar field. The method also includes sensing, by the radar system, reflections from an object in the radar field and analyzing the reflections from the object in the radar field. The method further includes providing, based on the analysis of the reflections, radar data and maintaining, by the radar-based application, the microphone in a non-operational mode. The method also includes detecting, based on the radar data, a gesture by the object in the radar field and determining, based on the radar data, that the gesture is a voice input trigger. In response to determining that the gesture is the voice input trigger, the radar-based application causes the microphone to enter an operational mode.

Aspects described below also include a system comprising an electronic device that includes, or is associated with, a microphone and means for providing a radar field and detecting a gesture by an object in the radar field. The system also includes means for maintaining the microphone in a non-operational mode and determining that the gesture by the object in the radar field is a voice input trigger. The system also includes means for, responsive to determining the voice input trigger, causing the microphone to enter an operational mode.

This summary is provided to introduce simplified concepts concerning radar-based gesture enhancement for voice interfaces, which is further described below in the Detailed Description and Drawings. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

The details of one or more aspects of radar-based gesture enhancement for voice interfaces are described in this document with reference to the following drawings. The same numbers are used throughout the drawings to reference like features and components:

FIG. 1 illustrates an example environment in which techniques enabling radar-based gesture enhancement for voice interfaces can be implemented.

FIG. 2 illustrates an example implementation of the electronic device of FIG. 1 that includes a radar system and can implement radar-based gesture enhancement for voice interfaces.

FIG. 3 illustrates an example implementation of the radar system of FIG. 2.

FIG. 4 illustrates example arrangements of receiving antenna elements for the radar system of FIG. 3.

FIG. 5 illustrates additional details of an example implementation of the radar system of FIG. 2.

FIG. 6 illustrates an example scheme that can be implemented by the radar system of FIG. 2.

FIG. 7 illustrates another example environment in which techniques enabling radar-based gesture enhancement for voice interfaces can be implemented.

FIGS. 8 and 9 depict an example method enabling radar-based gesture enhancement for voice interfaces.

FIG. 10 illustrates an example implementation of an electronic device that can implement additional details of the method of FIGS. 8 and 9.

FIGS. 11 and 12 illustrate additional details of the method of FIGS. 8 and 9.

FIG. 13 illustrates various components of an example computing system that can be implemented as any type of client, server, and/or electronic device as described with reference to FIGS. 1-12 to implement, or in which techniques may be implemented that enable, radar-based gesture enhancement for voice interfaces.

DETAILED DESCRIPTION

Overview

This document describes techniques and systems that enable radar-based gesture enhancement for voice interfaces. As noted, it can be challenging to give complex, device-specific voice commands via a voice interface because humans communicate using a mix of verbal and nonverbal communication, but the voice interface only understands the verbal portion. Thus, users may not realize the full potential of their smart-home features because of the limitations of voice interfaces. The techniques and systems employ a radar system to accurately determine three-dimensional (3D) gestures (e.g., a gesture that comprises one or more movements, in any direction, within a 3D space illuminated by a radar field, as described in this document). The 3D gestures can be used instead of, or in combination with, the voice interface to enhance interactions with voice-controllable devices. Because the user can make 3D gestures from a distance, the device providing the voice interface can remain in a non-operational (or non-listening) mode until the user provides a voice input trigger (e.g., a “listen” gesture), which can save power, increase privacy, and reduce unintentional commands.

Additionally, when the user wants to make an analog change, such as adjusting a volume setting for a music player or changing a lighting level, voice commands by themselves often allow little flexibility. For example, the user may say “turn lights down” or “increase volume” to adjust these functions. Additional voice commands may then be necessary to fine-tune the adjustments. Using the radar system with the described techniques, the user can employ voice commands, along with gestures, for tasks such as adjusting typically analog controls, like light level or music volume. A voice command, along with an intuitive 3D gesture specifying how much to lower the lights or increase the volume, allows the user to interact with the electronic device in a simple and natural way that more-closely matches typical human communication style. For example, the user may say “lower lights” while lowering a hand or say “increase volume” while making a gesture that has a motion of turning a volume dial.

Further, some voice interfaces may give audio responses to confirm the command or indicate that the command is being performed. Once the voice interface starts responding or is performing a task improperly, it can be difficult to interrupt and correct the voice interface. Using the described techniques, the user can use 3D gestures to interrupt and correct the voice interface, allowing the user to be more effective. Thus, the described techniques and systems can improve the quality and effectiveness of the user's experience and thereby increase the user's efficiency, work flow, and enjoyment.

Consider an electronic device that includes a radar-based application with a voice interface that can be used to control appliances and other devices in a home. For example, the radar-based application may allow a user to control a thermostat or home security system, or to make real-time adjustments to a volume of an entertainment system or a brightness level of dimmable lights in a room. In this example, the electronic device may include various cameras and microphones to enable the voice interface. A conventional voice interface can receive voice commands and perform the actions associated with the commands. Thus, the user may give voice commands that control simple functions, such as turning lights on or off, adjusting audio volume levels, and so forth. The conventional voice interface, however, is typically less effective for complex commands and fine-tuning adjustments in an analog manner. For example, when a movie is over and the credits are playing, the user may wish to turn up the lighting in a home-theater room and turn down the volume of the home-theater speakers. To do so, the user gives several commands, possibly beginning with a listen prompt to alert the voice interface that voice commands are about to be given. Then, the user issues the relevant voice commands, which may include multiple iterations (e.g., “lights up sixty percent” then “lights down twenty percent” and then “volume down fifty percent”). Even after adjusting the lights and volume up and down, the user can still be unsatisfied with the results and have to resort to manual controls. Consistently difficult or inconvenient interactions with the voice interface can reduce efficiency and the quality of the user's experience with the voice interface, or even reduce the likelihood that the user will use the voice interface.

Contrast these conventional techniques with the systems and techniques described in this document, which can improve efficiency and usability in several areas. For instance, in the example above, the user is trying to make adjustments to light and audio volume levels, for which an analog adjustment, such as a rheostat on a dimmer switch or a volume dial on a stereo, would be a natural and intuitive control. In this situation, the electronic device may include a radar system that can provide a radar field that extends into an area around the device (e.g., a five- or eight-foot radius around the device). The radar system can use radar signals reflected from objects that enter the radar field to detect gestures made by the user, in combination with a voice command, to enable the user to fine-tune the light and audio volume levels.

In this way, the described techniques and systems allow efficient and natural interaction with voice-controlled devices. The user can enjoy the advantages and convenience of voice control, while using 3D gestures to provide additional flexibility and enhanced functionality. This can improve efficiency and reduce user frustration, such as having to adjust and re-adjust various devices to achieve the desired result, which increases the quality of the user experience. Further, power consumption of the radar system can be substantially less than that of some conventional techniques that may use an always-on microphone to enable the voice interface.

These are but a few examples of how the techniques and devices described herein may be used to allow users to interact with devices using both a voice interface and 3D gestures. Other examples and implementations are described throughout this document. The document now turns to an example environment, after which example systems, apparatuses, methods, and components are described.

Operating Environment

FIG. 1 illustrates an example environment 100 in which techniques enabling radar-based gesture enhancement for voice interfaces can be implemented. The example environment 100 includes an electronic device 102, which includes, or is associated with, a radar system 104, a radar-based application 106, and a microphone 108. In the example environment 100, the radar system 104 provides a radar field 110 by transmitting one or more radar signals or waveforms as described below with reference to FIGS. 3-6. The radar field 110 is a volume of space from which the radar system 104 can detect reflections of the radar signals and waveforms (e.g., radar signals and waveforms reflected from objects in the volume of space). The radar system 104 also enables the electronic device 102, in this case a smartphone 102-1, to sense and analyze reflections from an object 112 in the radar field 110.

The object 112 may be any of a variety of objects that the radar system 104 can sense and analyze reflections from, such as wood, plastic, metal, fabric, or human body parts (e.g., a hand of a user of the electronic device 102). As shown in FIG. 1, the object 112 is a person or a user of the smartphone 102-1 (person 112 or user 112). Based on the analysis of the reflections, the radar system 104 can provide radar data that includes various types of information associated with the radar field 110 and the reflections from the object 112, as described below with reference to FIGS. 3-6 (e.g., the radar system 104 can pass the radar data to other entities, such as the radar-based application 106).

It should be noted that the radar data may be continuously or periodically provided over time, based on the sensed and analyzed reflections from the object 112 in the radar field 110. A position of the object 112 can change over time (e.g., the object 112 may move within the radar field 110) and the radar data can thus vary over time corresponding to the changed positions, reflections, and analyses. Because the radar data may vary over time, the radar system 104 may provide radar data that includes one or more subsets of radar data that correspond to different periods of time. For example, the radar system 104 may provide a first subset of the radar data corresponding to a first time-period, a second subset of the radar data corresponding to a second time-period, and so forth.
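The following sketch illustrates one way such time-period subsets could be formed from a stream of radar frames. It is an illustration only; the RadarFrame type, the 0.5-second window length, and the grouping function are hypothetical and are not taken from this document.

```python
# Hypothetical sketch: grouping a stream of radar frames into consecutive
# time-period subsets, as described above.
from dataclasses import dataclass

@dataclass
class RadarFrame:
    timestamp: float     # seconds since the radar field was activated
    range_doppler: list  # placeholder for this frame's radar data

def subsets_by_time_period(frames, period_s=0.5):
    """Group radar frames into consecutive time-period subsets."""
    subsets = {}
    for frame in frames:
        index = int(frame.timestamp // period_s)  # which time-period the frame falls in
        subsets.setdefault(index, []).append(frame)
    # subsets[0] is the first subset of the radar data, subsets[1] the
    # second subset, and so forth.
    return [subsets[i] for i in sorted(subsets)]
```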

The radar-based application 106 may be any of a variety of radar-based applications that can receive voice commands or instructions (e.g., through the microphone 108), which can be used to interact with the electronic device 102 or with a variety of other devices, such as home appliances, security systems, entertainment systems, lights (e.g., a lamp 114), or an internet-of-things (IoT) device. In some implementations, the radar-based application 106 is, or includes, a voice assistant (e.g., a system-specific voice assistant associated with a particular brand or type of home-automation system or a generic voice assistant that can work with a variety of home-automation systems and devices).

The radar-based application 106 can also control the microphone 108, such as by maintaining the microphone 108 in a non-operational mode. The non-operational mode can be a mode in which the microphone 108 is powered off and cannot receive, analyze, or record audio input. In other implementations, the non-operational mode may be a mode in which the microphone 108 is connected to power and can receive audio input, but cannot be used to record, analyze, or otherwise act on the audio input. The powered non-operational mode may be achieved using various methods (e.g., a software or firmware control signal or a hardware control, such as a switch) that prohibit data transfer from the microphone 108 to memory devices, processors, or other devices. The powered non-operational mode may be used with electronic devices 102 that use a combination component that serves as both a speaker and a microphone to enable these electronic devices 102 to produce sound when in the non-operational mode. The powered non-operational mode can also reduce the likelihood that the microphone 108 misses the beginning of an audio command if there is a delay between applying power and being able to receive audio input.
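The following sketch models the microphone modes just described. The MicMode names and the boolean audio-path gate are hypothetical illustrations, not an API from this document; an actual implementation would use the software, firmware, or hardware controls mentioned above.

```python
# Hypothetical sketch of the microphone modes described above.
from enum import Enum, auto

class MicMode(Enum):
    POWERED_OFF = auto()              # cannot receive, analyze, or record audio input
    POWERED_NON_OPERATIONAL = auto()  # receives audio, but recording/analysis is blocked
    OPERATIONAL = auto()              # audio input is received and acted upon

class Microphone:
    def __init__(self):
        self.mode = MicMode.POWERED_OFF
        self.audio_path_enabled = False

    def set_mode(self, mode: MicMode):
        # In the powered non-operational mode, a control signal or hardware
        # switch prohibits data transfer from the microphone to memory or
        # processors; that gate is modeled here as a simple boolean.
        self.mode = mode
        self.audio_path_enabled = (mode == MicMode.OPERATIONAL)
```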

The radar-based application 106 may also include a 3D gesture module 116, which can store both information related to determining 3D gestures based on the radar data and information related to actions that correspond to the 3D gestures. Based on the radar data, the radar-based application 106 can detect the 3D gesture by the user 112 and determine that the gesture is a voice input trigger (e.g., using the 3D gesture module 116). The voice input trigger is an indication to the radar-based application 106 that it may receive voice input (or voice commands). In response to determining that the 3D gesture is the voice input trigger, the radar-based application 106 causes the microphone 108 to enter an operational mode that enables the microphone 108 to receive, and act on, voice commands or other audio input. In some implementations, the radar-based application 106 can also cause the microphone 108 to enter, or re-enter, the non-operational mode when the radar-based application 106 does not receive a voice or other audio input within a threshold time of receiving the voice input trigger. Because the radar-based application 106 can maintain the microphone 108 in a non-operational mode until the voice input trigger is received, users may have increased privacy and a frequency of inadvertent or unintentional voice commands may be reduced.
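To make the trigger flow concrete, the following sketch shows one way the microphone could be switched to the operational mode on a voice input trigger and returned to the non-operational mode after a timeout. It assumes the hypothetical Microphone and MicMode types from the previous sketch; the classify_gesture() and wait_for_audio() helpers and the 2.0-second timeout are likewise assumptions standing in for the 3D gesture module 116 and the threshold time described above.

```python
# Hypothetical sketch of the voice-input-trigger flow described above.
VOICE_INPUT_TIMEOUT_S = 2.0  # illustrative threshold time

def on_radar_data(radar_data, mic, classify_gesture, wait_for_audio):
    gesture = classify_gesture(radar_data)  # e.g., via the 3D gesture module 116
    if gesture == "voice_input_trigger":    # e.g., a "listen" gesture
        mic.set_mode(MicMode.OPERATIONAL)
        # If no voice or other audio input arrives within the threshold
        # time, re-enter the non-operational mode.
        if not wait_for_audio(timeout_s=VOICE_INPUT_TIMEOUT_S):
            mic.set_mode(MicMode.POWERED_NON_OPERATIONAL)
```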

A 3D gesture can be any of a variety of gestures, including a scrolling gesture made by moving a hand above the electronic device 102 along a horizontal dimension (e.g., from a left side of the electronic device 102 to a right side of the electronic device 102), a waving gesture made by the user's arm rotating about an elbow, or a pushing gesture made by moving the user's hand above the electronic device 102 along a vertical dimension (e.g., from a bottom side of the electronic device 102 to a top side of the electronic device 102). Other types of 3D gestures or motions may also be made, such as a reaching gesture made by moving the user's hand towards the electronic device 102, a knob-turning gesture made by curling fingers of the user's hand to grip an imaginary door knob and rotating in a clockwise or counter-clockwise fashion to mimic an action of turning the imaginary door knob, and a spindle-twisting gesture made by rubbing a thumb and at least one other finger together. Each of these example gesture types may be detected by the radar system 104. Upon detecting each of these gestures, the electronic device 102 may perform an action, such as provide a voice input trigger, activate or control a home-automation system, activate one or more sensors, open an application, control an entertainment system, light, or appliance, pin content to a screen, silence an alarm, or control a user interface. In this way, the radar system 104 provides touch-free control of the electronic device 102.
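A gesture-to-action mapping like the one the 3D gesture module 116 might store is sketched below. The gesture labels and actions are hypothetical illustrations drawn loosely from the examples above, not identifiers from this document.

```python
# Hypothetical sketch of a stored mapping from detected 3D gestures to actions.
GESTURE_ACTIONS = {
    "scroll_horizontal": lambda: print("control user interface: scroll"),
    "wave": lambda: print("silence alarm"),
    "push_vertical": lambda: print("pin content to screen"),
    "reach": lambda: print("activate one or more sensors"),
    "knob_turn_cw": lambda: print("turn volume up"),
    "knob_turn_ccw": lambda: print("turn volume down"),
    "spindle_twist": lambda: print("fine-tune lighting level"),
    "listen": lambda: print("voice input trigger: enable microphone"),
}

def perform_action(gesture_label):
    action = GESTURE_ACTIONS.get(gesture_label)
    if action is not None:
        action()

perform_action("listen")  # -> voice input trigger: enable microphone
```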

In FIG. 1, the voice input trigger is a 3D gesture in which the user 112 drops and extends an arm, as shown by an arrow 118 and a dashed-line depiction of the arm. As described with reference to FIGS. 3-6, the radar system 104 can use the radar field 110 to sense and analyze reflections from objects in the radar field 110 in ways that enable high, or increased, resolution and accuracy for both gesture recognition and body posture. Thus, the voice input trigger gesture may take forms other than that shown by the arrow 118, such as a micro-gesture or a movement of the user 112 to within a threshold distance of the electronic device 102. Further, the voice input trigger gesture may be a predefined 3D gesture, a 3D gesture selected from a list, or a custom gesture (e.g., the user may interact with the radar-based application 106 and the radar system 104 to define a unique gesture, or combination of gestures, as the voice input trigger). Unless indicated otherwise by a particular context, increased accuracy refers to an increased degree of refinement, an increased conformity to truth, or both the increased degree of refinement and the increased conformity to truth.

In some implementations, the radar-based application 106 includes, or is in communication with, a voice interface module 120. The voice interface module 120 can receive the voice input (e.g., the voice command), determine an action that corresponds to the voice input, and cause the electronic device 102 to perform the corresponding action. In some implementations, the voice interface module 120 may be used to maintain the microphone 108 in the non-operational mode. As shown in FIG. 1, the voice interface module 120 is part of the radar-based application 106, but the voice interface module 120 can be a separate entity that is part of, or separate from, the electronic device 102. In this way, the radar-based application 106 can use both the voice input and a gesture input to interact with the electronic device 102 or with another device.

Consider two examples. In the first example, once the microphone 108 enters the operational mode, the radar-based application 106 (e.g., the voice interface module 120) receives the voice command that is directed to a device that can be interacted with via the radar-based application 106. Once the voice command is received, the radar-based application 106 receives a 3D gesture that specifies a particular device. In this case, the voice command is any of a variety of commands, such as “turn lights down” or “turn speakers up,” and the 3D gesture is a 3D gesture, such as a pointing gesture, that specifies which lights or a specific speaker. The radar-based application 106 can distinguish between devices in a variety of manners, such as using a virtual map of an environment in which the radar-based application 106 is operating. The virtual map may be generated using techniques such as those described below with respect to FIG. 7. In this way, the 3D gesture can be used, along with the voice command, to turn on a particular light, adjust a particular speaker in a home-theater system, and so forth.
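One way to resolve a pointing gesture against such a virtual map is sketched below. The map entries, the azimuth-based matching, and the angular tolerance are hypothetical assumptions for illustration; this document does not specify the map's data layout.

```python
# Hypothetical sketch: selecting a device from a virtual map by combining a
# voice command (which names a device type) with a pointing gesture (whose
# direction the radar system estimates).
VIRTUAL_MAP = [
    {"type": "light", "name": "reading lamp", "azimuth_deg": -40.0},
    {"type": "light", "name": "ceiling light", "azimuth_deg": 10.0},
    {"type": "speaker", "name": "left speaker", "azimuth_deg": 60.0},
]

def resolve_pointing(voice_device_type, pointing_azimuth_deg, tolerance_deg=15.0):
    """Pick the device of the spoken type closest to the pointing direction."""
    candidates = [d for d in VIRTUAL_MAP if d["type"] == voice_device_type]
    best = min(candidates,
               key=lambda d: abs(d["azimuth_deg"] - pointing_azimuth_deg),
               default=None)
    if best and abs(best["azimuth_deg"] - pointing_azimuth_deg) <= tolerance_deg:
        return best
    return None

# "turn lights down" while pointing toward -35 degrees selects the reading lamp:
print(resolve_pointing("light", -35.0))
```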

In the second example (again, once the microphone 108 enters the operational mode), the radar-based application 106, or the voice interface module 120, receives a voice command to adjust a function of a device that can be interacted with via the radar-based application 106. Once the voice command is received, the radar-based application 106 receives a 3D gesture that specifies an amount of adjustment to the function of the device. As in the first example, the voice command is any of a variety of commands, such as “turn lights down” or “turn speakers up,” and the 3D gesture is a 3D gesture that adjusts the lights or speakers, such as a downward hand gesture or a gesture that has a motion of turning a volume dial. Thus, the electronic device 102, the radar system 104, and the radar-based application 106 work together to enable users of voice interfaces to efficiently and conveniently use both voice commands and 3D gestures to make adjustments to functions of devices that can be interacted with via a voice interface.
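The sketch below shows one way a gesture's measured motion could be mapped to an amount of adjustment in this second example. The scaling constants, gesture fields, and set_level() callback are hypothetical assumptions for illustration.

```python
# Hypothetical sketch: the voice command supplies the function and direction,
# while the 3D gesture supplies the amount of adjustment.
def adjustment_from_gesture(voice_command, gesture):
    if voice_command == "turn lights down":
        # Downward hand gesture: each centimeter of hand drop dims lights 5%.
        return -5.0 * gesture["hand_drop_cm"]
    if voice_command == "turn speakers up":
        # Dial-turning gesture: each degree of rotation raises volume 0.25%.
        return 0.25 * gesture["rotation_deg"]
    return 0.0

def apply(voice_command, gesture, current_level, set_level):
    delta = adjustment_from_gesture(voice_command, gesture)
    set_level(max(0.0, min(100.0, current_level + delta)))

# "turn lights down" while lowering a hand 8 cm dims the lights by 40%:
apply("turn lights down", {"hand_drop_cm": 8.0}, 75.0, print)  # prints 35.0
```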

In some implementations, including implementations of the examples described above, the 3D gesture is effective if received within a threshold time of, or approximately simultaneously with, receiving the voice command. The threshold time may be any appropriate time (e.g., 0.5, 1.5, or 2.5 seconds), and may be predefined, user-selectable, or determined via a machine-learning module that is included in, or associated with, the radar system 104 or the radar-based application 106.
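A minimal sketch of this time-window pairing follows. The event tuple layout and the 1.5-second default (one of the example values above) are illustrative assumptions.

```python
# Hypothetical sketch: a gesture is effective only if it arrives within a
# threshold time of, or approximately simultaneously with, the voice command.
def pair_command_and_gesture(voice_event, gesture_event, threshold_s=1.5):
    """voice_event and gesture_event are (timestamp_s, payload) tuples."""
    voice_t, command = voice_event
    gesture_t, gesture = gesture_event
    # The gesture may arrive slightly before, slightly after, or at
    # approximately the same time as the voice command.
    if abs(gesture_t - voice_t) <= threshold_s:
        return command, gesture
    return None  # gesture not effective for this command
```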

In still other implementations, including implementations of the examples described above, the radar-based application 106 or the voice interface module 120 (again, once the microphone 108 enters the operational mode) may provide an audio message to the user. For example, in response to a gesture, voice command, or other input, the audio message may be used to request confirmation of the command. In another example, the gesture, voice command, or other input may include an instruction for the radar-based application 106 or the voice interface module 120 to provide the audio message. In some cases, however, the audio message may be incorrect, or the user may reconsider the input. The user may then use another 3D gesture to stop the audio message. The radar-based application 106 can receive the other 3D gesture, which corresponds to a command to cease providing the audio message. In response to receiving the other 3D gesture, the radar-based application 106 (or the voice interface module 120) stops providing the audio message. The other 3D gesture may be effective if received within a duration of the audio message (e.g., while the audio message is playing), or before the audio message begins (e.g., between the input that causes the audio message and the beginning of the audio message). After the audio message is stopped, the microphone 108 may remain in the operational mode or, depending on the settings of the radar-based application 106, enter the non-operational mode.

In more detail, consider FIG. 2, which illustrates an example implementation 200 of the electronic device 102 (including the radar system 104, the radar-based application 106, and the microphone 108) that can implement radar-based gesture enhancement for voice interfaces. The electronic device 102 of FIG. 2 is illustrated with a variety of example devices, including a smartphone 102-1, a tablet 102-2, a laptop 102-3, a desktop computer 102-4, a computing watch 102-5, computing spectacles 102-6, a gaming system 102-7, a home-automation and control system 102-8, and a vehicle 102-9. The electronic device 102 can also include other devices, such as televisions, entertainment systems, audio systems, drones, track pads, drawing pads, netbooks, e-readers, home security systems, and other home appliances. Note that the electronic device 102 can be wearable, non-wearable but mobile, or relatively immobile (e.g., desktops and appliances).

The electronic device 102 also includes one or more computer processors 202 and one or more computer-readable media 204, which includes memory media and storage media. Applications and/or an operating system (not shown) implemented as computer-readable instructions on the computer-readable media 204 can be executed by the computer processors 202 to provide some of the functionalities described herein. The electronic device 102 may also include a network interface 206. The electronic device 102 can use the network interface 206 for communicating data over wired, wireless, or optical networks. By way of example and not limitation, the network interface 206 may communicate data over a local-area-network (LAN), a wireless local-area-network (WLAN), a personal-area-network (PAN), a wide-area-network (WAN), an intranet, the Internet, a peer-to-peer network, a point-to-point network, or a mesh network.

Various implementations of the radar system 104 can include a System-on-Chip (SoC), one or more Integrated Circuits (ICs), a processor with embedded processor instructions or configured to access processor instructions stored in memory, hardware with embedded firmware, a printed circuit board with various hardware components, or any combination thereof. The radar system 104 operates as a monostatic radar by transmitting and receiving its own radar signals. In some implementations, the radar system 104 may also cooperate with other radar systems 104 that are within an external environment to implement a bistatic radar, a multistatic radar, or a network radar. Constraints or limitations of the electronic device 102, however, may impact a design of the radar system 104. The electronic device 102, for example, may have limited power available to operate the radar, limited computational capability, size constraints, layout restrictions, an exterior housing that attenuates or distorts radar signals, and so forth. The radar system 104 includes several features that enable advanced radar functionality and high performance to be realized in the presence of these constraints, as further described below with respect to FIG. 3. Note that in FIG. 2, the radar system 104 is illustrated as part of the electronic device 102. In other implementations, the radar system 104 may be separate or remote from the electronic device 102.

These and other capabilities and configurations, as well as ways in which entities of FIG. 1 act and interact, are set forth in greater detail below. These entities may be further divided, combined, and so on. The environment 100 of FIG. 1 and the detailed illustrations of FIG. 2 through FIG. 12 illustrate some of many possible environments and devices capable of employing the described techniques.

FIG. 3 illustrates an example implementation 300 of the radar system 104 that can be used to enable radar-based gesture enhancement for voice interfaces. In the example 300, the radar system 104 includes at least one of each of the following components: a communication interface 302, an antenna array 304, a transceiver 306, a processor 308, and a system media 310 (e.g., one or more computer-readable storage media). The processor 308 can be implemented as a digital signal processor, a controller, an application processor, another processor (e.g., the computer processor 202 of the electronic device 102), or some combination thereof. The system media 310, which may be included within, or be separate from, the computer-readable media 204 of the electronic device 102, includes one or more of the following modules: an attenuation mitigator 314, a digital beamformer 316, an angle estimator 318, or a power manager 320. These modules can compensate for, or mitigate the effects of, integrating the radar system 104 within the electronic device 102, thereby enabling the radar system 104 to recognize small or complex gestures, distinguish between different orientations of the user, continuously monitor an external environment, or realize a target false-alarm rate. With these features, the radar system 104 can be implemented within a variety of different devices, such as the devices illustrated in FIG. 2.

Using the communication interface 302, the radar system 104 can provide radar data to the radar-based application 106. The communication interface 302 may be a wireless or wired interface based on the radar system 104 being implemented separate from, or integrated within, the electronic device 102. Depending on the application, the radar data may include raw or minimally processed data, in-phase and quadrature (I/Q) data, range-Doppler data, processed data including target location information (e.g., range, azimuth, elevation), clutter map data, and so forth. Generally, the radar data contains information that is usable by the radar-based application 106 for radar-based gesture enhancement for voice interfaces.

The antenna array 304 includes at least one transmitting antenna element (not shown) and at least two receiving antenna elements (as shown in FIG. 4). In some cases, the antenna array 304 may include multiple transmitting antenna elements to implement a multiple-input multiple-output (MIMO) radar capable of transmitting multiple distinct waveforms at a time (e.g., a different waveform per transmitting antenna element). The use of multiple waveforms can increase a measurement accuracy of the radar system 104. The receiving antenna elements can be positioned in a one-dimensional shape (e.g., a line) or a two-dimensional shape for implementations that include three or more receiving antenna elements. The one-dimensional shape enables the radar system 104 to measure one angular dimension (e.g., an azimuth or an elevation) while the two-dimensional shape enables two angular dimensions to be measured (e.g., both azimuth and elevation). Example two-dimensional arrangements of the receiving antenna elements are further described with respect to FIG. 4.

FIG. 4 illustrates example arrangements 400 of receiving antenna elements 402. If the antenna array 304 includes at least four receiving antenna elements 402, for example, the receiving antenna elements 402 can be arranged in a rectangular arrangement 404-1 as depicted in the middle of FIG. 4. Alternatively, a triangular arrangement 404-2 or an L-shape arrangement 404-3 may be used if the antenna array 304 includes at least three receiving antenna elements 402.

Due to a size or layout constraint of the electronic device 102, an element spacing between the receiving antenna elements 402 or a quantity of the receiving antenna elements 402 may not be ideal for the angles at which the radar system 104 is to monitor. In particular, the element spacing may cause angular ambiguities to be present that make it challenging for conventional radars to estimate an angular position of a target. Conventional radars may therefore limit a field of view (e.g., angles that are to be monitored) to avoid an ambiguous zone, which has the angular ambiguities, and thereby reduce false detections. For example, conventional radars may limit the field of view to angles between approximately −45 degrees and 45 degrees to avoid angular ambiguities that occur using a wavelength of 5 millimeters (mm) and an element spacing of 3.5 mm (e.g., the element spacing being 70% of the wavelength). Consequently, the conventional radar may be unable to detect targets that are beyond the 45-degree limits of the field of view. In contrast, the radar system 104 includes the digital beamformer 316 and the angle estimator 318, which resolve the angular ambiguities and enable the radar system 104 to monitor angles beyond the 45-degree limit, such as angles between approximately −90 degrees and 90 degrees, or up to approximately −180 degrees and 180 degrees. These angular ranges can be applied across one or more directions (e.g., azimuth and/or elevation). Accordingly, the radar system 104 can realize low false-alarm rates for a variety of different antenna array designs, including element spacings that are less than, greater than, or equal to half a center wavelength of the radar signal.
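As a back-of-the-envelope check (not from this document) of the ±45-degree figure above: for a uniform linear array, angle estimates are unambiguous only while the spatial-sampling condition |sin θ| ≤ λ/(2d) holds, where λ is the wavelength and d the element spacing.

```python
# Illustrative computation of the unambiguous field of view for the example
# values quoted above (5 mm wavelength, 3.5 mm element spacing).
import math

wavelength_mm = 5.0   # e.g., roughly a 60 GHz radar signal
spacing_mm = 3.5      # element spacing, 70% of the wavelength

sin_limit = wavelength_mm / (2.0 * spacing_mm)        # = 5/7 ≈ 0.714
theta_limit_deg = math.degrees(math.asin(sin_limit))  # ≈ 45.6 degrees
print(f"unambiguous field of view: ±{theta_limit_deg:.1f} degrees")
# Prints roughly ±45.6 degrees, consistent with the approximately
# ±45-degree field-of-view limit of the conventional radar in this example.
```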

Using the antenna array 304, the radar system 104 can form beams that are steered or un-steered, wide or narrow, or shaped (e.g., as a hemisphere, cube, fan, cone, or cylinder). As an example, the one or more transmitting antenna elements (not shown) may have an un-steered omnidirectional radiation pattern or may be able to produce a wide beam, such as the wide transmit beam 406. Either of these techniques enables the radar system 104 to illuminate a large volume of space. To achieve target angular accuracies and angular resolutions, however, the receiving antenna elements 402 and the digital beamformer 316 can be used to generate thousands of narrow and steered beams (e.g., 2000 beams, 4000 beams, or 6000 beams), such as the narrow receive beam 408. In this way, the radar system 104 can efficiently monitor the external environment and accurately determine arrival angles of reflections within the external environment.

Returning to FIG. 3, the transceiver 306 includes circuitry and logic for transmitting and receiving radar signals via the antenna array 304. Components of the transceiver 306 can include amplifiers, mixers, switches, analog-to-digital converters, filters, and so forth for conditioning the radar signals. The transceiver 306 can also include logic to perform in-phase/quadrature (I/Q) operations, such as modulation or demodulation. The transceiver 306 can be configured for continuous wave radar operations or pulsed radar operations. A variety of modulations can be used to produce the radar signals, including linear frequency modulations, triangular frequency modulations, stepped frequency modulations, or phase modulations.

The transceiver 306 can generate radar signals within a range of frequencies (e.g., a frequency spectrum), such as between 1 gigahertz (GHz) and 400 GHz, between 4 GHz and 100 GHz, or between 57 GHz and 63 GHz. The frequency spectrum can be divided into multiple sub-spectra that have a similar bandwidth or different bandwidths. The bandwidths can be on the order of 500 megahertz (MHz), 1 GHz, 2 GHz, and so forth. As an example, different frequency sub-spectra may include frequencies between approximately 57 GHz and 59 GHz, 59 GHz and 61 GHz, or 61 GHz and 63 GHz. For coherence, multiple frequency sub-spectra that have a same bandwidth, and that may be contiguous or non-contiguous, may also be chosen. The multiple frequency sub-spectra can be transmitted simultaneously or separated in time using a single radar signal or multiple radar signals. The contiguous frequency sub-spectra enable the radar signal to have a wider bandwidth, while the non-contiguous frequency sub-spectra can further emphasize amplitude and phase differences that enable the angle estimator 318 to resolve angular ambiguities. The attenuation mitigator 314 or the angle estimator 318 may cause the transceiver 306 to utilize one or more frequency sub-spectra to improve performance of the radar system 104, as further described with respect to FIGS. 5 and 6.

The power manager 320 enables the radar system 104 to conserve power internally or externally within the electronic device 102. Internally, for example, the power manager 320 can cause the radar system 104 to collect data using a predefined power mode or a specific duty cycle. Instead of operating at either a low-power mode or a high-power mode, the power manager 320 dynamically switches between different power modes such that response delay and power consumption are managed together based on the activity within the environment. In general, the power manager 320 determines when and how power can be conserved, and incrementally adjusts power consumption to enable the radar system 104 to operate within power limitations of the electronic device 102. In some cases, the power manager 320 may monitor an amount of available power remaining and adjust operations of the radar system 104 accordingly. For example, if the remaining amount of power is low, the power manager 320 may continue operating at the low-power mode instead of switching to the higher-power mode.

The low-power mode, for example, may use a low duty cycle on the order of a few hertz (e.g., approximately 1 Hz or less than 5 Hz), which reduces power consumption to a few milliwatts (mW) (e.g., between approximately 2 mW and 5 mW). The high-power mode, on the other hand, may use a high duty cycle on the order of tens of hertz (Hz) (e.g., approximately 20 Hz or greater than 10 Hz), which causes the radar system 104 to consume power on the order of several milliwatts (e.g., between approximately 8 mW and 20 mW). While the low-power mode can be used to monitor the external environment or detect an approaching user, the power manager 320 may switch to the high-power mode if the radar system 104 determines the user is starting to perform a gesture. Different triggers may cause the power manager 320 to switch between the different power modes. Example triggers include motion or the lack of motion, appearance or disappearance of the user, the user moving into or out of a designated region (e.g., a region defined by range, azimuth, or elevation), a change in velocity of a motion associated with the user, or a change in reflected signal strength (e.g., due to changes in radar cross section). In general, the triggers that indicate a lower probability of the user interacting with the electronic device 102 or a preference to collect data using a longer response delay may cause a lower-power mode to be activated to conserve power.
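The sketch below illustrates this mode-switching logic. The numeric values mirror the examples in the preceding two paragraphs; the PowerMode type and the trigger flags are hypothetical illustrations, not part of the described implementation.

```python
# Hypothetical sketch of duty-cycle-based power-mode switching.
from dataclasses import dataclass

@dataclass(frozen=True)
class PowerMode:
    name: str
    update_rate_hz: float
    approx_power_mw: float

LOW_POWER = PowerMode("low", update_rate_hz=1.0, approx_power_mw=3.0)      # ~2-5 mW
HIGH_POWER = PowerMode("high", update_rate_hz=20.0, approx_power_mw=12.0)  # ~8-20 mW

def select_power_mode(user_present, gesture_starting, battery_low):
    # With little power remaining, stay at the low-power mode rather than
    # switching to the higher-power mode.
    if battery_low:
        return LOW_POWER
    # Switch to high power when the user appears to be starting a gesture;
    # otherwise monitor the environment at low power.
    if user_present and gesture_starting:
        return HIGH_POWER
    return LOW_POWER
```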

The power manager 320 can also conserve power by turning off one or more components within the transceiver 306 (e.g., a voltage-controlled oscillator, a multiplexer, an analog-to-digital converter, a phase lock loop, or a crystal oscillator) during inactive time periods. These inactive time periods occur if the radar system 104 is not actively transmitting or receiving radar signals, which may be on the order of microseconds (μs), milliseconds (ms), or seconds (s). Additionally, the power manager 320 can control the use of different hardware components within the radar system 104 to conserve power. If the processor 308 comprises a low-power processor and a high-power processor (e.g., processors with different amounts of memory and computational capability), for example, the power manager 320 can switch between utilizing the low-power processor for low-level analysis (e.g., detecting motion, determining a location of a user, or monitoring the environment) and the high-power processor for situations in which high-fidelity or accurate radar data is requested by the radar-based application 106 (e.g., for gesture recognition or user orientation).

In addition to the internal power-saving techniques described above, the power manager 320 can also conserve power within the electronic device 102 by activating or deactivating other external components or sensors that are within the electronic device 102. These external components may include speakers, a camera sensor, a global positioning system, a wireless communication transceiver, a display, a gyroscope, or an accelerometer. Because the radar system 104 can monitor the environment using a small amount of power, the power manager 320 can appropriately turn these external components on or off based on where the user is located or what the user is doing. In this way, the electronic device 102 can seamlessly respond to the user and conserve power without the use of automatic shut-off timers or the user physically touching or verbally controlling the electronic device 102.

FIG. 5 illustrates additional details of an example implementation 500 of the radar system 104 within the electronic device 102. In the example 500, the antenna array 304 is positioned underneath an exterior housing of the electronic device 102, such as a glass cover or an external case. Depending on its material properties, the exterior housing may act as an attenuator 502, which attenuates or distorts radar signals that are transmitted and received by the radar system 104. The attenuator 502 may include different types of glass or plastics, some of which may be found within display screens, exterior housings, or other components of the electronic device 102 and have a dielectric constant (e.g., relative permittivity) between approximately four and ten. Accordingly, the attenuator 502 is opaque or semi-transparent to a radar signal 506 and may cause a portion of a transmitted or received radar signal 506 to be reflected (as shown by a reflected portion 504). For conventional radars, the attenuator 502 may decrease an effective range that can be monitored, prevent small targets from being detected, or reduce overall accuracy.

Assuming a transmit power of the radar system 104 is limited, and re-designing the exterior housing is not desirable, one or more attenuation-dependent properties of the radar signal 506 (e.g., a frequency sub-spectrum 508 or a steering angle 510) or attenuation-dependent characteristics of the attenuator 502 (e.g., a distance 512 between the attenuator 502 and the radar system 104 or a thickness 514 of the attenuator 502) are adjusted to mitigate the effects of the attenuator 502. Some of these characteristics can be set during manufacturing or adjusted by the attenuation mitigator 314 during operation of the radar system 104. The attenuation mitigator 314, for example, can cause the transceiver 306 to transmit the radar signal 506 using the selected frequency sub-spectrum 508 or the steering angle 510, cause a platform to move the radar system 104 closer or farther from the attenuator 502 to change the distance 512, or prompt the user to apply another attenuator to increase the thickness 514 of the attenuator 502.

Appropriate adjustments can be made by the attenuation mitigator 314 based on pre-determined characteristics of the attenuator 502 (e.g., characteristics stored in the computer-readable media 204 of the electronic device 102 or within the system media 310) or by processing returns of the radar signal 506 to measure one or more characteristics of the attenuator 502. Even if some of the attenuation-dependent characteristics are fixed or constrained, the attenuation mitigator 314 can take these limitations into account to balance each parameter and achieve a target radar performance. As a result, the attenuation mitigator 314 enables the radar system 104 to realize enhanced accuracy and larger effective ranges for detecting and tracking the user that is located on an opposite side of the attenuator 502. These techniques provide alternatives to increasing transmit power, which increases power consumption of the radar system 104, or changing material properties of the attenuator 502, which can be difficult and expensive once a device is in production.
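The sketch below illustrates one possible selection strategy for the adjustable parameters: search candidate frequency sub-spectra and steering angles for the combination with the lowest predicted loss through the attenuator. The predict_loss_db() model is a crude stand-in invented for illustration; a real attenuation mitigator would rely on stored attenuator characteristics or measured radar returns, as described above.

```python
# Hypothetical sketch: choosing a frequency sub-spectrum and steering angle
# that minimize predicted loss through the attenuator.
from itertools import product

SUB_SPECTRA_GHZ = [(57, 59), (59, 61), (61, 63)]  # candidate sub-spectra
STEERING_ANGLES_DEG = [-30, -15, 0, 15, 30]       # candidate steering angles

def predict_loss_db(sub_spectrum, angle_deg, attenuator):
    # Invented loss model: loss grows with thickness, dielectric constant,
    # center frequency, and off-normal incidence through the attenuator.
    center_ghz = sum(sub_spectrum) / 2.0
    obliquity = 1.0 + abs(angle_deg) / 90.0
    return (0.05 * attenuator["thickness_mm"] * attenuator["dielectric"]
            * center_ghz * obliquity)

def mitigate(attenuator):
    return min(product(SUB_SPECTRA_GHZ, STEERING_ANGLES_DEG),
               key=lambda c: predict_loss_db(c[0], c[1], attenuator))

best = mitigate({"thickness_mm": 0.8, "dielectric": 6.5})
print("selected sub-spectrum and steering angle:", best)
```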

FIG. 6 illustrates an example scheme 600 implemented by the radar system 104. Portions of the scheme 600 may be performed by the processor 308, the computer processors 202, or other hardware circuitry. The scheme 600 can be customized to support different types of electronic devices 102 and radar-based applications 106, and also enables the radar system 104 to achieve target angular accuracies despite design constraints.

The transceiver 306 produces raw data 602 based on individual responses of the receiving antenna elements 402 to a received radar signal. The received radar signal may be associated with one or more frequency sub-spectra 604 that were selected by the angle estimator 318 to facilitate angular ambiguity resolution. The frequency sub-spectra 604, for example, may be chosen to reduce a quantity of sidelobes or reduce an amplitude of the sidelobes (e.g., reduce the amplitude by 0.5 dB, 1 dB, or more). A quantity of frequency sub-spectra can be determined based on a target angular accuracy or computational limitations of the radar system 104.

The raw data 602 contains digital information (e.g., in-phase and quadrature data) for a period of time, different wavenumbers, and multiple channels respectively associated with the receiving antenna elements 402. A Fast-Fourier Transform (FFT) 606 is performed on the raw data 602 to generate pre-processed data 608. The pre-processed data 608 includes digital information across the period of time, for different ranges (e.g., range bins), and for the multiple channels. A Doppler filtering process 610 is performed on the pre-processed data 608 to generate range-Doppler data 612. The Doppler filtering process 610 may comprise another FFT that generates amplitude and phase information for multiple range bins, multiple Doppler frequencies, and for the multiple channels. The digital beamformer 316 produces beamforming data 614 based on the range-Doppler data 612. The beamforming data 614 contains digital information for a set of azimuths and/or elevations, which represents the field of view for which different steering angles or beams are formed by the digital beamformer 316. Although not depicted, the digital beamformer 316 may alternatively generate the beamforming data 614 based on the pre-processed data 608, and the Doppler filtering process 610 may generate the range-Doppler data 612 based on the beamforming data 614. To reduce a quantity of computations, the digital beamformer 316 may process a portion of the range-Doppler data 612 or the pre-processed data 608 based on a range, time, or Doppler frequency interval of interest.
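The following numpy sketch mirrors that processing chain: raw I/Q samples, a range FFT, a Doppler FFT, and conventional (Fourier) beamforming. The array sizes, element spacing, and steering-vector model are illustrative assumptions, not parameters from this document.

```python
# Hypothetical sketch of the scheme above: raw data 602 -> FFT 606 ->
# Doppler filtering 610 -> digital beamforming (beamforming data 614).
import numpy as np

n_chirps, n_samples, n_channels = 64, 128, 4  # one period of raw data 602
raw = (np.random.randn(n_chirps, n_samples, n_channels)
       + 1j * np.random.randn(n_chirps, n_samples, n_channels))

# FFT 606 over fast time: pre-processed data 608 (time x range bins x channels).
pre_processed = np.fft.fft(raw, axis=1)

# Doppler filtering process 610: another FFT over slow time yields
# range-Doppler data 612 (Doppler bins x range bins x channels).
range_doppler = np.fft.fftshift(np.fft.fft(pre_processed, axis=0), axes=0)

# Digital beamformer 316: steer across a set of azimuths to produce
# beamforming data 614 (Doppler bins x range bins x azimuths).
azimuths = np.deg2rad(np.linspace(-90, 90, 181))
spacing_in_wavelengths = 0.7  # e.g., 3.5 mm spacing at a 5 mm wavelength
steering = np.exp(-2j * np.pi * spacing_in_wavelengths
                  * np.arange(n_channels)[:, None] * np.sin(azimuths)[None, :])
beamforming = range_doppler @ steering  # coherent sum over channels per azimuth
print(beamforming.shape)                # (64, 128, 181)
```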

The digital beamformer 316 can be implemented using a single-look beamformer 616, a multi-look interferometer 618, or a multi-look beamformer 620. In general, the single-look beamformer 616 can be used for deterministic objects (e.g., point-source targets having a single phase center). For non-deterministic targets (e.g., targets having multiple phase centers), the multi-look interferometer 618 or the multi-look beamformer 620 are used to improve accuracies relative to the single-look beamformer 616. Humans are an example of a non-deterministic target and have multiple phase centers 622 that can change based on different aspect angles, as shown at 624-1 and 624-2. Variations in the constructive or destructive interference generated by the multiple phase centers 622 can make it challenging for conventional radars to accurately determine angular positions. The multi-look interferometer 618 or the multi-look beamformer 620, however, perform coherent averaging to increase an accuracy of the beamforming data 614. The multi-look interferometer 618 coherently averages two channels to generate phase information that can be used to accurately determine the angular information. The multi-look beamformer 620, on the other hand, can coherently average two or more channels using linear or non-linear beamformers, such as Fourier, Capon, multiple signal classification (MUSIC), or minimum variance distortionless response (MVDR). The increased accuracies provided via the multi-look beamformer 620 or the multi-look interferometer 618 enable the radar system 104 to recognize small gestures or distinguish between multiple portions of the user.

The angle estimator 318 analyzes the beamforming data 614 to estimate one or more angular positions. The angle estimator 318 may utilize signal-processing techniques, pattern-matching techniques, or machine learning. The angle estimator 318 also resolves angular ambiguities that may result from a design of the radar system 104 or the field of view the radar system 104 monitors. An example angular ambiguity is shown within an amplitude plot 626 (e.g., amplitude response).

The amplitude plot 626 depicts amplitude differences that can occur for different angular positions of the target and for different steering angles 510. A first amplitude response 628-1 (illustrated with a solid line) is shown for a target positioned at a first angular position 630-1. Likewise, a second amplitude response 628-2 (illustrated with a dotted line) is shown for the target positioned at a second angular position 630-2. In this example, the differences are considered across angles between −180 degrees and 180 degrees.

As shown in the amplitude plot 626, an ambiguous zone exists for the two angular positions 630-1 and 630-2. The first amplitude response 628-1 has a highest peak at the first angular position 630-1 and a lesser peak at the second angular position 630-2. While the highest peak corresponds to the actual position of the target, the lesser peak causes the first angular position 630-1 to be ambiguous because it is within some threshold for which conventional radars may be unable to confidently determine whether the target is at the first angular position 630-1 or the second angular position 630-2. In contrast, the second amplitude response 628-2 has a lesser peak at the second angular position 630-2 and a higher peak at the first angular position 630-1. In this case, the lesser peak corresponds to the target's location.

While conventional radars may be limited to using a highest peak amplitude to determine the angular positions, the angle estimator 318 instead analyzes subtle differences in shapes of the amplitude responses 628-1 and 628-2. Characteristics of the shapes can include, for example, roll-offs, peak or null widths, an angular location of the peaks or nulls, a height or depth of the peaks and nulls, shapes of sidelobes, symmetry within the amplitude response 628-1 or 628-2, or the lack of symmetry within the amplitude response 628-1 or 628-2. Similar shape characteristics can be analyzed in a phase response, which can provide additional information for resolving the angular ambiguity. The angle estimator 318 therefore maps the unique angular signature or pattern to an angular position.

The angle estimator 318 can include a suite of algorithms or tools that can be selected according to the type of electronic device 102 (e.g., computational capability or power constraints) or a target angular resolution for the radar-based application 106. In some implementations, the angle estimator 318 can include a neural network 632, a convolutional neural network (CNN) 634, or a long short-term memory (LSTM) network 636. The neural network 632 can have various depths or quantities of hidden layers (e.g., three hidden layers, five hidden layers, or ten hidden layers) and can also include different quantities of connections (e.g., the neural network 632 can comprise a fully-connected neural network or a partially-connected neural network). In some cases, the CNN 634 can be used to increase computational speed of the angle estimator 318. The LSTM network 636 can be used to enable the angle estimator 318 to track the target. Using machine-learning techniques, the angle estimator 318 employs non-linear functions to analyze the shape of the amplitude response 628-1 or 628-2 and generate angular probability data 638, which indicates a likelihood that the user or a portion of the user is within an angular bin. The angle estimator 318 may provide the angular probability data 638 for a few angular bins, such as two angular bins to provide probabilities of a target being to the left or right of the electronic device 102, or for thousands of angular bins (e.g., to provide the angular probability data 638 for a continuous angular measurement).
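The sketch below shows, in plain numpy with made-up layer sizes and untrained random weights, the shape of such a network: an amplitude response sampled at 181 steering angles goes in, and a probability per angular bin (i.e., angular probability data like 638) comes out. A real estimator would be trained; this is an illustration of the data flow only.

```python
# Hypothetical sketch: a small fully-connected network mapping an amplitude
# response to per-angular-bin probabilities.
import numpy as np

rng = np.random.default_rng(0)
n_input, n_hidden, n_bins = 181, 64, 181  # e.g., one angular bin per degree

W1, b1 = rng.normal(size=(n_hidden, n_input)) * 0.05, np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_bins, n_hidden)) * 0.05, np.zeros(n_bins)

def angular_probabilities(amplitude_response):
    hidden = np.maximum(0.0, W1 @ amplitude_response + b1)  # ReLU hidden layer
    logits = W2 @ hidden + b2
    exp = np.exp(logits - logits.max())                     # softmax over angular bins
    return exp / exp.sum()

probs = angular_probabilities(rng.normal(size=n_input))
print(probs.argmax(), probs.max())  # most likely angular bin and its probability
```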

Based on the angular probability data 638, a tracker module 640 produces angular position data 642, which identifies an angular location of the target. The tracker module 640 may determine the angular location of the target based on the angular bin that has a highest probability in the angular probability data 638 or based on prediction information (e.g., previously-measured angular position information). The tracker module 640 may also keep track of one or more moving targets to enable the radar system 104 to confidently distinguish or identify the targets. Other data can also be used to determine the angular position, including range, Doppler, velocity, or acceleration. In some cases, the tracker module 640 can include an alpha-beta tracker, a Kalman filter, a multiple hypothesis tracker (MHT), and so forth.
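
Of the named options, the alpha-beta tracker is the simplest to sketch. The generic Python example below tracks angular position and rate; the gains and time step are illustrative assumptions, not values from this document.

    class AlphaBetaTracker:
        """Classic alpha-beta filter tracking angular position (degrees)
        and angular rate (degrees/second)."""

        def __init__(self, alpha=0.85, beta=0.005, dt=0.05):
            self.alpha, self.beta, self.dt = alpha, beta, dt
            self.angle = 0.0   # current position estimate
            self.rate = 0.0    # current rate estimate

        def update(self, measured_angle: float) -> float:
            # Predict forward one time step, then correct by a fraction
            # (alpha, beta) of the residual between measurement and prediction.
            predicted = self.angle + self.rate * self.dt
            residual = measured_angle - predicted
            self.angle = predicted + self.alpha * residual
            self.rate = self.rate + (self.beta / self.dt) * residual
            return self.angle

Feeding the center of the most probable angular bin into update() at each radar frame would yield a smoothed estimate of the kind carried in the angular position data 642.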

A quantizer module 644 obtains the angular position data 642 and quantizes the data to produce quantized angular position data 646. The quantization can be performed based on a target angular resolution for the radar-based application 106. In some situations, fewer quantization levels can be used such that the quantized angular position data 646 indicates whether the target is to the right or to the left of the electronic device 102 or identifies a 90-degree quadrant the target is located within. This may be sufficient for some radar-based applications 106, such as user proximity detection. In other situations, a larger number of quantization levels can be used such that the quantized angular position data 646 indicates an angular position of the target within an accuracy of a fraction of a degree, one degree, five degrees, and so forth. This resolution can be used for higher-resolution radar-based applications 106, such as gesture recognition. In some implementations, the digital beamformer 316, the angle estimator 318, the tracker module 640, and the quantizer module 644 are together implemented in a single machine-learning module.
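
A uniform quantizer over the full angular range is one simple way to realize the level choices described above; the sketch below is a generic illustration, not the document's implementation.

    def quantize_angle(angle_deg: float, levels: int) -> int:
        """Map an angle in [-180, 180) to one of `levels` equal bins.

        levels=2 answers only left/right of the device; levels=4 gives a
        90-degree quadrant; levels=3600 resolves a tenth of a degree, as a
        higher-resolution application such as gesture recognition might need.
        """
        span = 360.0 / levels
        return int((angle_deg + 180.0) // span) % levels

    assert quantize_angle(-90.0, 2) == 0   # left of the device
    assert quantize_angle(45.0, 4) == 2    # quadrant containing +45 degrees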

These and other capabilities and configurations, as well as ways in which entities of FIGS. 1-6 act and interact, are set forth below. The described entities may be further divided, combined, used along with other sensors or components, and so on. In this way, different implementations of the electronic device 102, with different configurations of the radar system 104 and non-radar sensors, can be used to implement radar-based gesture enhancement for voice interfaces. The example operating environment 100 of FIG. 1 and the detailed illustrations of FIGS. 2-6 illustrate but some of many possible environments and devices capable of employing the described techniques.

Example Systems

As noted with respect to FIG. 1, the techniques and systems described herein can also enable the electronic device 102 to generate and use a virtual map of an environment in which the electronic device 102 is operating. The virtual map allows the electronic device 102 to distinguish between one or more devices in the environment that can be interacted with using a voice interface.

FIG. 7 illustrates another example environment 700 in which techniques enabling radar-based gesture enhancement for voice interfaces can be implemented. The example operating environment 700 includes the electronic device 102, which includes, or is associated with, the radar system 104, the microphone 108, and a radar-based application 702. In the example environment 700, the radar system 104 provides the radar field 110 as described above with reference to FIGS. 1-6. The radar system 104 also enables the electronic device 102, in this case the smartphone 102-1, to sense and analyze reflections from an object 704 in the radar field 110. As shown in FIG. 7, the object 704 is a person or a user of the smartphone 102-1 (person 704 or user 704), but the object 704 may be any of a variety of objects that the radar system 104 can sense and analyze reflections from, such as wood, plastic, metal, fabric, or organic material (e.g., the user 704). Based on the analysis of the reflections, the radar system 104 can provide radar data as described above with reference to FIGS. 1-6 and may pass the radar data to other entities, such as the radar-based application 702.

The radar-based application 702 may be any of a variety of radar-based applications that can receive voice commands or instructions, such as the radar-based application 106. In some implementations, the radar-based application 702 may also include a voice interface module, such as the voice interface module 120, to receive and process voice commands. The radar-based application 702 can use the voice commands or instructions to interact with the electronic device 102 or with a variety of other devices, such as home appliances, security systems, entertainment systems, lights (e.g., the lamp 114), or an internet-of-things (IoT) device. The radar-based application 702 may include a 3D gesture module, such as the 3D gesture module 116, which can be used to determine, at least in part, 3D gestures based on the radar data and actions that correspond to the 3D gestures.

The radar-based application 702 can also generate and store (e.g., in a memory device included in, or associated with, the electronic device 102) a virtual map of an environment. The environment may be a room, a building, or another environment. The virtual map identifies a location and a type of one or more devices in the environment that can be interacted with via the radar-based application. For example, the virtual map for the environment 700 may include a type of device (e.g., the lamp 114, which may be dimmable or not dimmable) and a location of the lamp 114 in the environment 700. The virtual map may also include a location for one or more types of speakers (e.g., left-, right-, and center-channel speakers) or media devices (e.g., the location of a stereo, a video player, or a satellite receiver box).
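
One plausible shape for such a virtual map is a list of typed, located device records. The field names and coordinate convention below are illustrative assumptions; the document does not prescribe a schema.

    from dataclasses import dataclass

    @dataclass
    class MappedDevice:
        """One entry in a virtual map: what a device is, where it sits,
        and any capability flags relevant to interaction."""
        device_type: str          # e.g., "lamp", "center-channel speaker"
        position: tuple           # (x, y, z) coordinates in the room, meters
        capabilities: tuple = ()  # e.g., ("dimmable",)

    virtual_map = [
        MappedDevice("lamp", (2.1, 0.4, 0.8), ("dimmable",)),
        MappedDevice("center-channel speaker", (0.0, 3.0, 0.5)),
    ]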

The virtual map may be generated by any of a variety of entities, such as the radar-based application 702, the radar system 104, or the radar-based application 106. In some implementations, the electronic device 102 includes an image-capture device (e.g., a video camera) and the radar-based application 702 may generate the virtual map based on a scan of the environment by the image-capture device. The scan may be performed manually by the user 704 or automatically by the electronic device 102. For example, the user can move the image-capture device around to scan the room for devices that can be interacted with via the radar-based application 702, or the image-capture device can be integrated with the electronic device 102 and perform the scan automatically. The image-capture device may enter a non-operational mode when the scan is complete, such as through a manual shutoff by the user 704 or, in response to a determination that the image-capture device has stopped moving, automatically. In the case of automatic entry into the non-operational mode, the image-capture device may remain operational for a threshold period of time after the determination that it has stopped moving (e.g., 1, 3, or 5 seconds).

In other implementations, the radar-based application 702 (or another entity) may generate the virtual map by receiving an identification signal from the devices that can be interacted with by the radar-based application 702. The identification signal includes the location and the type of the devices, and may also include other information related to the interactions, such as device settings and signal strength. The radar-based application 702 can generate the virtual map using the location of the electronic device 102 and the information in the identification signal. In some cases, the locations may be determined using various position determination techniques, such as received signal strength indication (RSSI), fingerprinting, angle of arrival (AoA) and time of flight (ToF) based techniques, along with trilateration or triangulation algorithms. The identification signal may be received over a network, including any of the networks described with reference to FIGS. 2-6 and 13.
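
As a sketch of the trilateration step only, the hypothetical example below solves a linearized least-squares system for a 2D position from known anchor positions and range estimates (which could come from RSSI or ToF measurements). It assumes at least three anchors and noise-free ranges; it is not the document's method.

    import numpy as np

    def trilaterate(anchors: np.ndarray, ranges: np.ndarray) -> np.ndarray:
        """Given anchor positions (n x 2) and measured distances to each,
        solve for the unknown 2D position. Requires n >= 3 anchors."""
        # Subtract the first range equation from the rest to remove the
        # quadratic terms, leaving a linear system in the position.
        a = 2 * (anchors[1:] - anchors[0])
        b = (ranges[0] ** 2 - ranges[1:] ** 2
             + np.sum(anchors[1:] ** 2, axis=1) - np.sum(anchors[0] ** 2))
        pos, *_ = np.linalg.lstsq(a, b, rcond=None)
        return pos

    anchors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
    true_pos = np.array([1.0, 1.0])
    ranges = np.linalg.norm(anchors - true_pos, axis=1)
    print(trilaterate(anchors, ranges))  # approximately [1.0, 1.0]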

In yet other implementations, the radar-based application 702 (or another entity) may generate the virtual map by receiving input from the user 704 that identifies the devices that can be interacted with by the radar-based application. For example, the user 704 can manually input the location and the type of the devices that can be interacted with by detecting the devices with the radar-based application 702 and then touching or pointing at selected devices and providing information such as the type (overhead light, floor lamp, main speakers, and so forth). The input may include text input or voice input.

The radar-based application 702 can enable the user 704 to interact with the devices using a combination of voice and gesture commands. For example, the radar-based application 702, or the voice interface module 120, can receive a voice command (e.g., via the microphone 108), directed to at least one of the identified devices, that includes at least the type of the device. Once the voice command is received, the radar-based application 702 can determine, based on the radar data, a 3D gesture that corresponds to a sub-command that is directed to the same devices as the voice command and is related to the voice command. The sub-command is a command that is related to the voice command by adding to, restricting, directing, fine-tuning, or otherwise modifying or supplementing the voice command. In response to both the voice command and the 3D gesture, the radar-based application 702 causes the device, or devices, to perform an action that corresponds to the voice command and the sub-command.
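
One way such a pairing could be resolved is sketched below: among mapped devices matching the spoken type, pick the one whose bearing best matches a pointing gesture. The map structure, the unit pointing vector, and the cosine-similarity scoring are all illustrative assumptions.

    import math

    def resolve_command(voice_type, pointing_direction, device_map):
        """Pick, among mapped devices of the spoken type, the one whose
        bearing from the user best matches the pointing gesture.

        device_map: list of dicts like {"type": "lamp", "position": (x, y)},
        positions relative to the user; pointing_direction is a unit (x, y)
        vector derived from radar data. All names are illustrative.
        """
        best, best_score = None, -2.0
        for device in device_map:
            if device["type"] != voice_type:
                continue
            dx, dy = device["position"]
            norm = math.hypot(dx, dy) or 1.0
            # Cosine similarity between device bearing and pointing vector.
            score = (dx * pointing_direction[0] + dy * pointing_direction[1]) / norm
            if score > best_score:
                best, best_score = device, score
        return best

    # "Dim the light" plus a point toward (1, 0): picks the lamp at (2, 0.2).
    lamps = [{"type": "lamp", "position": (2.0, 0.2)},
             {"type": "lamp", "position": (-1.5, 2.0)}]
    print(resolve_command("lamp", (1.0, 0.0), lamps))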

For example, the user 704 can give a voice command, shown in FIG. 7 as a text bubble (voice command 706), to “Dim the light.” After, or at approximately a same time as, giving the voice command 706, the user 704 can provide a 3D gesture, shown in FIG. 7 by a dashed-line arrow (3D gesture 708), indicating which light to dim. In FIG. 7, the 3D gesture 708 is a rotation of the arm of the user 704 up to point at the lamp 114. Using the virtual map, as described in this document, the radar-based application 702 can determine that the user 704 intends to dim the lamp 114. Thus, in this example, the sub-command corresponding to the 3D gesture 708 specifies, based on the identified location, a particular one of the devices that can be interacted with.

In another example, the voice command is a command to adjust a function of at least one of the identified devices, and the sub-command that corresponds to the 3D gesture specifies an amount of adjustment to the function of the identified device. For example, the voice command may be “turn the center-channel speaker down.” The 3D gesture may be the user 704 grasping a virtual dial and rotating the dial in a direction that reduces the volume of the center-channel speaker (e.g., in proportion to an amount of rotation).
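
The proportional mapping could be as simple as the hypothetical sketch below; the gain of 0.1 dB per degree is an arbitrary illustrative choice.

    def volume_delta_from_rotation(rotation_deg: float,
                                   db_per_degree: float = 0.1) -> float:
        """Map rotation of a virtual dial to a volume change in dB,
        proportional to the amount of rotation. Positive (clockwise)
        rotation turns volume up; negative turns it down."""
        return rotation_deg * db_per_degree

    # A 90-degree counterclockwise grasp-and-turn lowers volume by 9 dB.
    print(volume_delta_from_rotation(-90.0))  # -9.0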

In some implementations, including any of the examples described above, the radar-based application 702 may be included in, or be in communication with, an interaction manager 710. The interaction manager 710 can be any of a variety of controllers, modules, or managers that can control or coordinate the functions of the radar-based application 702, the 3D gesture module 116, and the voice interface module 120. As shown in FIG. 7, the interaction manager 710 is integrated with the electronic device 102. In other implementations, however, the electronic device 102, the interaction manager 710, the radar-based application 702, the 3D gesture module 116, and the voice interface module 120 may be separate from each other or combined in a variety of manners.

In still other implementations, including any of the examples described above, the 3D gesture (sub-command) is effective if received after the voice command. In other implementations, the 3D gesture may be effective if received either later than, or at approximately a same time as, the voice command. In this way, the described techniques enable the user 704 to use a simple voice command, along with a natural gesture, to quickly and easily perform a task.

Example Methods

FIGS. 8 and 9 depict an example method 800 enabling radar-based gesture enhancement for voice interfaces. The method 800 can be performed with an electronic device that includes a microphone and uses a radar system to provide a radar field. The radar field is used to detect a 3D gesture by an object in the radar field and determine that the 3D gesture is a voice input trigger that causes the microphone to exit a non-operational mode and enter an operational mode. The method 800 is shown as a set of blocks that specify operations performed but are not necessarily limited to the order or combinations shown for performing the operations by the respective blocks. Further, any of one or more of the operations may be repeated, combined, reorganized, or linked to provide a wide array of additional and/or alternate methods. In portions of the following discussion, reference may be made to the example operating environment 100 of FIG. 1 or to entities or processes as detailed in FIGS. 2-7, reference to which is made for example only. The techniques are not limited to performance by one entity or multiple entities operating on one device.

At 802, a radar field is provided. This radar field can be provided by any of a variety of electronic devices (e.g., the electronic device 102 described above) that include a microphone (e.g., the microphone 108), a radar system (e.g., the radar system 104), and a radar-based application (e.g., the radar-based applications 106 or 702, including the voice interface module 120). Further, the radar field may be any of a variety of types of radar fields, such as the radar field 110 described above.

At 804, reflections from an object in the radar field are sensed by the radar system. The object may be any of a variety of objects, such as wood, plastic, metal, fabric, or organic material. For example, the object may be a person or a body part of a person (e.g., a hand), such as the objects 112 or 704 as described above.

At 806, the reflections from the object in the radar field are analyzed. The analysis may be performed by any of a variety of entities (e.g., the radar system 104 or any of the radar-based applications described herein) and may include various operations or determinations, such as those described with reference to FIGS. 3-6.

At 808, based on the analysis of the reflections, radar data, such as the radar data described above, is provided. The radar data may be provided by any of a variety of entities, such as the radar system 104 or any of the radar-based applications described herein. In some implementations, the radar system may provide the radar data and pass the radar data to other entities (e.g., any of the described radar-based applications, interaction managers, or voice interface modules). The description of the method 800 continues in FIG. 9, as indicated by the letter “A” after block 808 of FIG. 8, which corresponds to the letter “A” before block 810 of FIG. 9.

At 810, the radar-based application maintains the microphone in a non-operational mode. As noted with reference to FIG. 1, the non-operational mode can be a mode in which the microphone is powered off and cannot receive, analyze, or record audio input. In other implementations, the non-operational mode may be a mode in which the microphone is connected to power and can receive audio input, but cannot be used to record, analyze, or otherwise act on the audio input. As noted, the powered non-operational mode may be achieved using various methods (e.g., a software or firmware control signal or a hardware control, such as a switch) that prohibit data transfer from the microphone 108 to memory devices, processors, or other devices.

At 812, based on the radar data, the radar-based application detects a 3D gesture by the object in the radar field. As noted, a 3D gesture is a gesture that comprises one or more movements, in any direction, within a 3D space illuminated by the radar field.

At 814, the radar-based application determines, based on the radar data, that the 3D gesture is a voice input trigger. In some implementations, the radar-based application can detect the 3D gesture, and make the determination that the 3D gesture is the voice input trigger, using a 3D gesture module (e.g., the 3D gesture module 116). As noted, the voice input trigger is an indication to the radar-based application that it may receive voice input (or voice commands). As described with reference to FIGS. 3-6, the radar system can use the radar field to sense and analyze reflections from objects in the radar field 110 in ways that enable high, or increased, resolution and accuracy for both 3D gesture recognition and body posture. Thus, the voice input trigger gesture may take a variety of forms, such as a 3D hand gesture, a micro-gesture, or a movement of the object in the radar field to within a threshold distance of the electronic device. As noted with reference to FIG. 1, the voice input trigger gesture may be a predefined 3D gesture, a 3D gesture selected from a list, or a custom gesture. Unless indicated otherwise by a particular context, increased accuracy refers to an increased degree of refinement, an increased conformity to truth, or both the increased degree of refinement and the increased conformity to truth.
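
A minimal decision step for block 814 might look like the sketch below, assuming an upstream classifier that labels gestures and a range estimate from the radar data. The label set and the 0.3 m threshold are hypothetical.

    def is_voice_input_trigger(gesture_label: str,
                               distance_m: float,
                               trigger_gestures=("tap", "listen"),
                               threshold_m: float = 0.3) -> bool:
        """Treat a gesture as the voice input trigger if it matches a
        predefined or user-selected trigger gesture, or if the object has
        moved within a threshold distance of the electronic device."""
        return gesture_label in trigger_gestures or distance_m <= threshold_m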

At 816, in response to determining that the 3D gesture is the voice input trigger, the radar-based application causes the microphone to enter an operational mode that enables the microphone to receive, analyze, and act on voice commands or other audio input. In some implementations, the radar-based application can also cause the microphone to enter, or re-enter, the non-operational mode when the radar-based application does not receive a voice or other audio input within a threshold time of receiving the voice input trigger.
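
The trigger-then-timeout behavior of blocks 810-816 can be summarized as a small state machine. The sketch below is illustrative only; the 5-second timeout and method names are assumptions.

    import time

    class MicrophoneGate:
        """Microphone stays non-operational until a voice input trigger
        arrives, then reverts if no voice input follows within a timeout."""

        def __init__(self, timeout_s: float = 5.0):
            self.operational = False
            self.timeout_s = timeout_s
            self._armed_at = None

        def on_voice_input_trigger(self):
            self.operational = True          # block 816: enter operational mode
            self._armed_at = time.monotonic()

        def on_voice_input(self):
            self._armed_at = None            # a command arrived; stay operational

        def tick(self):
            # Re-enter the non-operational mode if the trigger went unused.
            if (self.operational and self._armed_at is not None
                    and time.monotonic() - self._armed_at > self.timeout_s):
                self.operational = False
                self._armed_at = None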

Consider, for example, FIG. 10, which illustrates an example implementation 1000 of an electronic device that can implement additional details of the method 800. FIG. 10 depicts the electronic device (in this case, the smartphone 102-1), including the radar system, which is providing the radar field 110, the radar-based application, and the microphone. In the example implementation 1000, assume that the radar-based application is maintaining the microphone in the non-operational mode when a user 1002 makes a tapping gesture 1004 on a surface of a table near the electronic device. The radar-based application, based on the radar data, determines that the gesture 1004 is the voice input trigger and causes the microphone to enter the operational mode. Without having to first touch or speak to the smartphone 102-1, the user 1002 can now use voice commands to interact with the smartphone 102-1 and with other devices that are subject to control by the radar-based application.

As noted, in some implementations of the method 800, the radar-based application includes, or is in communication with, a voice interface module (e.g., the voice interface module 120). The voice interface module can receive a voice input (e.g., a voice command), determine an action that corresponds to the voice input, and cause the electronic device to perform the corresponding action. The voice interface module may also be used to maintain the microphone in the non-operational mode. The voice interface module may be integrated with the radar-based application, or be a separate entity that is part of, or separate from, the electronic device 102. In this way, the radar-based application is enabled to use both a voice command and a gesture input to interact with another device.

For example, once the microphone enters the operational mode, the radar-based application (e.g., the voice interface module 120) can receive a voice command that is directed to a device that can be interacted with by the radar-based application. Once the voice command is received, the radar-based application can receive a 3D gesture that specifies a particular device. To be effective, the 3D gesture is received within a threshold time of, or approximately simultaneously with, receiving the voice command (e.g., the threshold time may be 0.5, 1.5, or 2.5 seconds). The voice command can be any of a variety of commands, such as “turn lights down” or “turn speakers up.” The 3D gesture may be, for example, a pointing gesture that specifies which lights to turn down or a specific speaker to turn up. The radar-based application can distinguish between devices in a variety of manners, such as by using a virtual map of an environment in which the radar-based application is operating, as described with reference to FIG. 7.

By way of further example, consider FIG. 11, which illustrates an example implementation 1100 that describes additional details regarding the method 800. In the example implementation 1100, a user 1102 is interacting with the electronic device 102 (in this case, the smartphone 102-1), and the radar system is providing the radar field 110. Assume, for this example, that the microphone has already entered the operational mode, as described above. The radar-based application (e.g., the voice interface module 120) can now receive voice commands from the user 1102 to adjust a function of a device that can be interacted with via the radar-based application, such as a lamp 1104. For example, the user can provide a voice command 1106, such as “lights up” (shown in FIG. 11 as a text balloon), to increase a brightness of the lamp 1104. Once the voice command 1106 is received, the radar-based application can receive a 3D gesture that specifies an amount of adjustment to the function of the device. In FIG. 11, the 3D gesture is an upward hand gesture, as indicated by a directional arrow 1108. As noted in the previous example, to be effective, the 3D gesture must be received within a threshold time of, or approximately simultaneously with, receiving the voice command (e.g., the threshold time may be 0.5, 1.5, or 2.5 seconds).

Note that the user 1102 may continue to adjust the brightness of the lamp 1104 by continuing the upward gesture to keep increasing the brightness or by a downward gesture (as shown by an arrow 1110) to decrease the brightness. Thus, the electronic device 102, along with the radar system and the radar-based application, work together to enable users of voice interfaces to efficiently and conveniently use both voice commands and 3D gestures to make adjustments to functions of devices that can be interacted with via a voice interface.

Consider another example, shown in FIG. 12, which illustrates an example implementation 1200 that describes additional details regarding the method 800. In the example implementation 1200, a user 1202 is interacting with the smartphone 102-1, and the radar system is providing the radar field 110. Assume, for this example, that the microphone has already entered the operational mode and, in response to a voice command or other event (e.g., a scheduled meeting reminder), the radar-based application, or the voice interface module 120, is providing an audio message 1204 (shown as a text balloon). In the example implementation 1200, the audio message 1204 is a notification (“Calling area code . . . ”) that the smartphone 102-1 is placing, or is about to place, a phone call. Further assume that the user 1202 does not want to hear the audio message 1204, and provides a 3D gesture that corresponds to a command to stop the audio message 1204. In the example implementation 1200, the 3D gesture is a right-to-left swipe near the smartphone 102-1, as shown by an arrow 1206 (3D gesture 1206). When the radar-based application receives the 3D gesture 1206, the audio message is stopped.

The 3D gesture 1206 may be effective if received within a duration of the audio message (e.g., while the audio message is playing), or before the audio message begins (e.g., between the input that causes the audio message and the beginning of the audio message). After the audio message is stopped, the microphone may remain in the operational mode or, depending on the settings of the radar-based application, enter the non-operational mode. In implementations in which the microphone remains in the operational mode, the user 1202 may provide additional voice commands. In this way, the user 1202 can correct the radar-based application if, for example, the audio message 1204 is incorrect, or provide a different or additional voice command.

It should be noted that these techniques for radar-based gesture enhancement for voice interfaces may be more secure than other techniques. Not only are 3D gestures (especially user-defined gestures, micro-gestures, and posture- or position-based gestures) not typically obtainable by an unauthorized person (unlike, for example, a password), but also a radar image of the user, even if it includes the user's face, does not visually identify the user as a photograph or video does. Even so, further to the descriptions above, the user may be provided with controls allowing the user to make an election as to both whether and when any of the systems, programs, modules, or features described in this document may enable collection of user information (e.g., information about a user's social network, social actions or activities, profession, a user's preferences, or a user's current location), and whether the user is sent content or communications from a server. In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed. For example, a user's identity may be treated so that no personally identifiable information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined. Thus, the user may have control over what information is collected about the user, how that information is used, and what information is provided to or about the user.

Example Computing System

FIG. 13 illustrates various components of an example computing system 1300 that can be implemented as any type of client, server, and/or electronic device as described with reference to the previous FIGS. 1-12 to implement radar-based gesture enhancement for voice interfaces.

The computing system 1300 includes communication devices 1302 that enable wired and/or wireless communication of device data 1304 (e.g., radar data, 3D gesture data, authentication data, reference data, received data, data that is being received, data scheduled for broadcast, data packets of the data). The device data 1304 or other device content can include configuration settings of the device, media content stored on the device, and/or information associated with a user of the device (e.g., an identity of a person within a radar field). Media content stored on the computing system 1300 can include any type of radar, biometric, audio, video, and/or image data. The computing system 1300 includes one or more data inputs 1306 via which any type of data, media content, and/or inputs can be received, such as human utterances, interactions with a radar field, touch inputs, user-selectable inputs (explicit or implicit), messages, music, television media content, recorded video content, and any other type of audio, video, and/or image data received from any content and/or data source. The data inputs 1306 may include, for example, the radar-based applications 106 and 702, the 3D gesture module 116, or the voice interface module 120.

The computing system 1300 also includes communication interfaces 1308, which can be implemented as any one or more of a serial and/or parallel interface, a wireless interface, any type of network interface, a modem, and as any other type of communication interface. The communication interfaces 1308 provide a connection and/or communication links between the computing system 1300 and a communication network by which other electronic, computing, and communication devices communicate data with the computing system 1300.

The computing system 1300 includes one or more processors 1310 (e.g., any of microprocessors, controllers, or other controllers) that can process various computer-executable instructions to control the operation of the computing system 1300 and to enable techniques for, or in which can be implemented, radar-based gesture enhancement for voice interfaces. Alternatively or additionally, the computing system 1300 can be implemented with any one or combination of hardware, firmware, or fixed logic circuitry that is implemented in connection with processing and control circuits, which are generally identified at 1312. Although not shown, the computing system 1300 can include a system bus or data transfer system that couples the various components within the device. A system bus can include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures.

The computing system 1300 also includes computer-readable media 1314, such as one or more memory devices that enable persistent and/or non-transitory data storage (i.e., in contrast to mere signal transmission), examples of which include random access memory (RAM), non-volatile memory (e.g., any one or more of a read-only memory (ROM), flash memory, EPROM, EEPROM, etc.), and a disk storage device. A disk storage device may be implemented as any type of magnetic or optical storage device, such as a hard disk drive, a recordable and/or rewriteable compact disc (CD), any type of a digital versatile disc (DVD), and the like. The computing system 1300 can also include a mass storage media device (storage media) 1316.

The computer-readable media 1314 provides data storage mechanisms to store the device data 1304, as well as various device applications 1318 and any other types of information and/or data related to operational aspects of the computing system 1300. For example, an operating system 1320 can be maintained as a computer application with the computer-readable media 1314 and executed on the processors 1310. The device applications 1318 may include a device manager, such as any form of a control application, software application, signal-processing and control modules, code that is native to a particular device, an abstraction module, a gesture recognition module, and other modules. The device applications 1318 may also include system components, engines, or managers to implement radar-based gesture enhancement for voice interfaces, such as the radar system 104, the radar-based application 106, the radar-based application 702, the 3D gesture module 116, the voice interface module 120, or the interaction manager 710. The computing system 1300 may also include, or have access to, one or more machine learning systems.

CONCLUSION

Although implementations of techniques for, and apparatuses enabling, radar-based gesture enhancement for voice interfaces have been described in language specific to features and/or methods, it is to be understood that the subject of the appended claims is not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as example implementations enabling radar-based gesture enhancement for voice interfaces.

What is claimed is:
1. An electronic device, comprising: a microphone; a radar system, implemented at least partially in hardware, configured to: provide a radar field; sense reflections from an object in the radar field; analyze the reflections from the object in the radar field; and provide, based on the analysis of the reflections, radar data; one or more computer processors; and one or more computer-readable media having instructions stored thereon that, responsive to execution by the one or more computer processors, implement a radar-based application configured to: maintain the microphone in a non-operational mode, the non-operational mode comprising a mode in which audio input received by the microphone cannot be recorded or acted upon; detect, based on the radar data, a gesture by the object in the radar field; determine, based on the radar data, that the gesture is a voice input trigger; and responsive to determining the gesture is the voice input trigger, cause the microphone to enter an operational mode.
2. The electronic device of claim 1, wherein: the object in the radar field is a user; and the voice input trigger is a three-dimensional (3D) gesture by the user.
3. The electronic device of claim 1, wherein the radar-based application is further configured to cause, responsive to not receiving a voice input within a threshold time of receiving the voice input trigger, the microphone to enter the non-operational mode.
4. The electronic device of claim 1, wherein: the electronic device further comprises a voice interface module that is configured, responsive to the microphone entering the operational mode, to receive, at a first time, a voice command, the voice command being a command directed to a device that can be interacted with via the radar-based application; and the radar-based application is further configured to receive, at a second time that is later than the first time and within a threshold time of the voice interface module receiving the voice command, a 3D gesture that specifies a particular device to which the voice command is directed.
5. The electronic device of claim 1, wherein: the electronic device further comprises a voice interface module that is configured, responsive to the microphone entering the operational mode, to receive, at a first time, a voice command, the voice command being a command to adjust a function of a device that can be interacted with via the radar-based application; and the radar-based application is further configured to receive, at a second time that is later than the first time and within a threshold time of the voice interface module receiving the voice command, a 3D gesture that specifies an amount of adjustment to the function of the device that can be interacted with via the radar-based application.
6. The electronic device of claim 1, wherein the radar-based application is further configured, responsive to the microphone entering the operational mode, to: provide an audio message; receive, within a duration of the audio message, another 3D gesture, the other 3D gesture corresponding to a command to cease providing the audio message; and responsive to receiving the other 3D gesture, cease providing the audio message.
7. The electronic device of claim 1, wherein the radar system further comprises a digital beamformer and an angle estimator, and the radar system is configured to monitor angles in a field of view between approximately −90 degrees and approximately 90 degrees.
8. The electronic device of claim 1, wherein the electronic device comprises: a tablet computing device; a wearable computing device; a smartphone; a home appliance; or a home-automation system.
9. The electronic device of claim 1, wherein the non-operational mode further comprises the microphone being powered off.
10. The electronic device of claim 1, wherein the non-operational mode further comprises the microphone being powered on, and the electronic device further comprises a control mechanism configured to prohibit the microphone from transmitting the audio input or data based on the audio input.
11. The electronic device of claim 10, wherein the control mechanism further comprises: a software control signal; a firmware control signal; a switch; a non-switch hardware control; or combinations thereof.
12. A system, comprising: an electronic device; a microphone; a radar system, implemented at least partially in hardware, configured to: provide a radar field; sense reflections from an object in the radar field; analyze the reflections from the object in the radar field; and provide, based on the analysis of the reflections, radar data; one or more computer processors; and one or more computer-readable media having instructions stored thereon that, responsive to execution by the one or more computer processors, implement an interaction manager configured to: generate and store a virtual map of an environment, the virtual map comprising a location and a type of one or more devices in the environment that can be interacted with via the interaction manager; receive, at a first time, a voice command directed to at least one identified device, the voice command including at least the type of the at least one identified device; determine, at a second time that is later than, or at approximately a same time as, the first time, a three-dimensional (3D) gesture, the determination based on the radar data, the 3D gesture corresponding to a sub-command, the sub-command related to the voice command; and responsive to the voice command and the 3D gesture, cause the at least one identified device to perform an action corresponding to the voice command and the sub-command.
13. The system of claim 12, wherein: the electronic device includes an image-capture device; and the interaction manager is further configured to: generate the virtual map based on a scan of the environment by the image-capture device; and responsive to a determination that the image-capture device has stopped moving, cause the image-capture device to enter a non-operational mode.
14. The system of claim 12, wherein the interaction manager is further configured to generate the virtual map by: receiving, from the identified devices, an identification signal, the identification signal including the location and the type of the identified devices; or receiving, via input from a user, identification of the identified devices, the identification including the location and the type of the identified devices.
15. The system of claim 12, wherein: the sub-command specifies, based on the identified location, a particular one of the at least one identified devices; or the voice command is a command to adjust a function of the at least one identified device and the sub-command specifies an amount of adjustment to the function of the at least one identified device; or the sub-command is a command that adds to, restricts, directs, or fine-tunes the voice command.
16. The system of claim 12, wherein the electronic device comprises: a tablet computing device; a wearable computing device; a smartphone; a home appliance; or a home-automation system.
17. A method implemented in an electronic device that includes a radar system, a radar-based application, and a microphone, the method comprising: providing, by the radar system, a radar field; sensing, by the radar system, reflections from an object in the radar field; analyzing the reflections from the object in the radar field; providing, based on the analysis of the reflections, radar data; maintaining, by the radar-based application, the microphone in a non-operational mode, the non-operational mode comprising a mode in which audio input received by the microphone cannot be recorded or acted upon; detecting, based on the radar data, a gesture by the object in the radar field; determining, based on the radar data, that the gesture is a voice input trigger; and responsive to determining the gesture is the voice input trigger, causing the microphone to enter an operational mode.
18. The method of claim 17, further comprising, responsive to not receiving a voice input within a threshold time of receiving the voice input trigger, causing the microphone to enter the non-operational mode.
19. The method of claim 17, wherein the object in the radar field is a user and the gesture is a three-dimensional (3D) gesture by the user or a movement by the user to within a threshold distance from the electronic device.
20. The method of claim 17, wherein the electronic device further comprises a voice interface module, and the method further comprises, responsive to the microphone entering the operational mode: receiving, at a first time and by the voice interface module, a voice command, the voice command being a command directed to a device that can be interacted with via the radar-based application; and receiving, by the radar-based application, at a second time that is later than the first time and within a threshold time of the voice interface module receiving the voice command, a 3D gesture that specifies a particular device to which the voice command is directed.
21. The method of claim 17, wherein the electronic device further comprises a voice interface module, and the method further comprises, responsive to the microphone entering the operational mode: receiving, at a first time and by the voice interface module, a voice command, the voice command being a command to adjust a function of a device that can be interacted with via the radar-based application; and receiving, by the radar-based application, at a second time that is later than the first time and within a threshold time of the voice interface module receiving the voice command, a 3D gesture that specifies an amount of adjustment to the function of the device that can be interacted with via the radar-based application.
22. The method of claim 17, wherein the non-operational mode further comprises the microphone being powered off.
23. The method of claim 17, wherein the non-operational mode further comprises the microphone being powered on, and the electronic device further includes a control mechanism configured to prohibit the microphone from transmitting the audio input or data based on the audio input.
24. The method of claim 23, wherein the control mechanism further comprises: a software control signal; a firmware control signal; a switch; a non-switch hardware control; or combinations thereof.